Knowledge Discovery in Academic Registrar Data Bases using Source Mining: Data and Text
Abstract
In this paper we describe a knowledge-based system for extracting knowledge from academic and registrar databases using source mining, where the sources are data or text. Other sources, such as images, sound, or gestures, are not covered by this research. Patterns of student behaviour were obtained by examining student attributes such as city of birth, scholarship needs, field of knowledge and major, gender, and other attributes, for various undergraduate academic programs offered by the campuses of the Tecnológico de Monterrey university system across the country. These patterns proved useful in predicting student enrolment and designing advertising campaigns. We use text mining techniques to match and compare course descriptions from universities with which we have student exchange programs, for the purpose of course revalidation. Equivalences for 64 courses were obtained by using text mining techniques to match course descriptions for universities such as Michigan State, Carnegie Mellon, and New Mexico State. These equivalences helped the International Programs Office in developing course revalidation. The data mining techniques employed include C4.5 decision tree learning and feed-forward neural networks as implemented in the SIPINA intelligent environment (Sipina Research). The text mining techniques used are based on statistical and syntactic-semantic analysis and include Clasi-Tex (Clasitex), IBM Intelligent Miner for Text (IntelligentMiner), Text Roller (TextRoller), and Free Text Technologies Master Text
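The statistical matching of course descriptions mentioned in the abstract can be illustrated with a small sketch. This is not the authors' method and does not reproduce any of the tools named above; it swaps in plain TF-IDF weighting with cosine similarity as one common statistical approach. The course codes, descriptions, and function names are hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps weights positive even for terms in every document.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_matches(home, partner):
    """For each home course, return (home_code, partner_code, score) for the
    partner course whose description is most similar."""
    home_codes = list(home)
    partner_codes = list(partner)
    vectors = tfidf_vectors(
        [home[c] for c in home_codes] + [partner[c] for c in partner_codes]
    )
    home_vecs = vectors[: len(home_codes)]
    partner_vecs = vectors[len(home_codes):]
    results = []
    for code, hv in zip(home_codes, home_vecs):
        score, match = max(
            (cosine(hv, pv), pc) for pc, pv in zip(partner_codes, partner_vecs)
        )
        results.append((code, match, score))
    return results


# Hypothetical catalogue entries, invented for this example.
home = {
    "CS-101": "Introduction to programming, algorithms and data structures",
    "MA-201": "Differential and integral calculus of several variables",
}
partner = {
    "CSE-131": "Algorithms, data structures and introductory programming",
    "MTH-234": "Multivariable calculus: differential and integral calculus in several variables",
    "HST-110": "History of modern Europe",
}

for home_code, partner_code, score in best_matches(home, partner):
    print(f"{home_code} -> {partner_code} (similarity {score:.2f})")
```

In practice a candidate pair produced this way would still need review by an academic office before being treated as an equivalence; the similarity score only ranks candidates.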
Similar resources
A Model for Extracting Information from Text Documents, Based on Text Mining in the E-Learning Domain
As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. T...
Document Clustering Based on Ontology and a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...
Data Mining: A Novel Outlook to Explore Knowledge in Health and Medical Sciences
Today the medical and healthcare industry generates loads of diverse data about patients, disease diagnosis, prognosis, management, hospitals’ resources, electronic patient health records, medical devices, etc. Using the most efficient processing and analysis method for knowledge extraction is key to saving costs in clinical decision making. Data mining, sometimes called data or knowledge...
Soft computing data mining
Soft computing is a consortium of methodologies (such as fuzzy logic, neural networks, genetic algorithms, and rough sets) that work synergistically and provide, in one form or another, flexible information processing capabilities for handling real-life problems. Its aim is to exploit the tolerance for imprecision, uncertainty, approximate reasoning and partial truth in order to achieve tractabil...
A Review of Data Mining Applications in the Health System
Introduction: The extensive amounts of data stored in medical databases require the development of specialized tools for accessing the data, data analysis, knowledge discovery, and the effective use of the data. Data mining is one of the most important methods. The article sketches the data mining techniques used and illustrates their applicability to medical diagnostic and prognostic problems. ...
Publication date: 2006